graph encoder
Co-Modality Graph Contrastive Learning for Imbalanced Node Classification - Appendix
In CM-GCL, we can take either the text feature x_T or the image feature x_I as the content feature, and use the corresponding text encoder f_T or image encoder f_I as the content encoder. In this section, we discuss the settings of baseline models for imbalanced node classification over four graphs. G1: We convert the rich text content into bag-of-words feature vectors and feed them, at different imbalance ratios, to a two-layer MLP [7] classifier to obtain the classification results. For the AMiner, YelpChi, and GitHub graph datasets, we apply Chi-Square [11] to select useful feature words. G2: We implement three graph neural network based representation learning models, GCN [5], GAT [9], and GraphSAGE [2], to learn node embeddings by leveraging both node features (bag-of-words feature vectors) and graph structure information.
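The feature pipeline of baseline G1 can be illustrated with a minimal sketch. It assumes the common 2x2 presence-by-class form of the chi-square statistic for text feature selection; the appendix does not specify which variant [11] uses, and the function names, toy documents, and labels below are hypothetical. The MLP classifier itself is omitted.

```python
from collections import Counter


def chi_square_scores(docs, labels):
    """Score each word by the chi-square statistic of its 2x2
    presence/absence-by-class table, taking the maximum over classes."""
    tokenized = [set(doc.lower().split()) for doc in docs]
    vocab = sorted(set().union(*tokenized))
    n = len(docs)
    scores = {}
    for word in vocab:
        best = 0.0
        for cls in set(labels):
            pairs = list(zip(tokenized, labels))
            a = sum(1 for t, y in pairs if word in t and y == cls)      # word present, in class
            b = sum(1 for t, y in pairs if word in t and y != cls)      # word present, other class
            c = sum(1 for t, y in pairs if word not in t and y == cls)  # word absent, in class
            d = n - a - b - c                                           # word absent, other class
            denom = (a + b) * (c + d) * (a + c) * (b + d)
            if denom > 0:
                best = max(best, n * (a * d - b * c) ** 2 / denom)
        scores[word] = best
    return scores


def select_top_k(scores, k):
    """Keep the k highest-scoring words (ties broken alphabetically)."""
    return sorted(scores, key=lambda w: (-scores[w], w))[:k]


def bag_of_words(docs, vocab):
    """Return one term-count vector per document, ordered by vocab."""
    rows = []
    for doc in docs:
        counts = Counter(doc.lower().split())
        rows.append([counts[w] for w in vocab])
    return rows


# Hypothetical toy corpus with two classes.
docs = [
    "graph neural network",
    "graph contrastive learning",
    "cheap pizza review",
    "great pizza place",
]
labels = [0, 0, 1, 1]

scores = chi_square_scores(docs, labels)
top = select_top_k(scores, 2)
features = bag_of_words(docs, top)
```

In this toy corpus the two selected words are the ones that separate the classes perfectly; the resulting vectors in `features` are what a two-layer MLP would consume.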
Contents in Appendices:
In Appendix A, we describe each of the components in GATA in detail. In Appendix B, we provide detailed information on how we pre-train GATA's graph updater ... GATA. Since the action scorer module is the same as in GATA, this appendix elaborates on ... In Appendix D, we provide additional results and discussions. In Appendix F, we show examples of graphs in TextWorld games.

As briefly mentioned in Section 3.3, GATA utilizes a graph encoder which is based on R-GCN [...]. The number of bases we use is 3.

A.2 Text Encoder
In the block, each convolutional layer has 64 filters, and each kernel's size is 5. Following standard transformer training, we add positional encodings into each block's ... The representation aggregator aims to combine the text observation representations and graph representations. The scorer consists of a self-attention layer, a masked mean pooling layer, and a two-layer MLP.
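The role of the "number of bases" in an R-GCN-style graph encoder can be sketched as follows. In the basis decomposition proposed for R-GCN, each relation-specific weight matrix is a learned linear combination of a small set of shared basis matrices, W_r = sum_b a_rb V_b. The code below is an illustration of that formula only, not GATA's implementation; the function names and the toy matrices are hypothetical.

```python
def basis_relation_weights(bases, coeffs):
    """Build one weight matrix per relation as W_r = sum_b coeffs[r][b] * bases[b].

    bases:  list of B matrices (each a list of rows), shared by all relations.
    coeffs: list of R coefficient vectors, each of length B.
    """
    num_rows = len(bases[0])
    num_cols = len(bases[0][0])
    weights = []
    for rel_coeffs in coeffs:
        w = [
            [sum(a * basis[i][j] for a, basis in zip(rel_coeffs, bases))
             for j in range(num_cols)]
            for i in range(num_rows)
        ]
        weights.append(w)
    return weights


def parameter_counts(num_relations, dim, num_bases):
    """Compare parameters of one full dim x dim matrix per relation
    against shared bases plus per-relation coefficients."""
    full = num_relations * dim * dim
    shared = num_bases * dim * dim + num_relations * num_bases
    return full, shared


# Three shared 2x2 bases (matching the 3 bases quoted above) and
# four hypothetical relations.
bases = [
    [[1, 0], [0, 0]],
    [[0, 1], [0, 0]],
    [[0, 0], [0, 1]],
]
coeffs = [
    [1, 0, 0],
    [0, 2, 0],
    [1, 1, 1],
    [0.5, 0, -1],
]
weights = basis_relation_weights(bases, coeffs)
full, shared = parameter_counts(num_relations=10, dim=64, num_bases=3)
```

With many relation types, sharing a few bases keeps the parameter count well below that of a separate full matrix per relation, which is the usual motivation for this design.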
A Systematic Study of Model Extraction Attacks on Graph Foundation Models
Xu, Haoyan; Qian, Ruizhi; Li, Jiate; Dong, Yushun; Lin, Minghao; Yan, Hanson; Yao, Zhengtao; Liu, Qinghua; Dong, Junhao; Huang, Ruopeng; Zhao, Yue; Li, Mengyuan
Graph machine learning has advanced rapidly in tasks such as link prediction, anomaly detection, and node classification. As models scale up, pretrained graph models have become valuable intellectual assets because they encode extensive computation and domain expertise. Building on these advances, Graph Foundation Models (GFMs) mark a major step forward by jointly pretraining graph and text encoders on massive and diverse data. This unifies structural and semantic understanding, enables zero-shot inference, and supports applications such as fraud detection and biomedical analysis. However, the high pretraining cost and broad cross-domain knowledge in GFMs also make them attractive targets for model extraction attacks (MEAs). Prior work has focused only on small graph neural networks trained on a single graph, leaving the security implications for large-scale and multimodal GFMs largely unexplored. This paper presents the first systematic study of MEAs against GFMs. We formalize a black-box threat model and define six practical attack scenarios covering domain-level and graph-specific extraction goals, architectural mismatch, limited query budgets, partial node access, and training data discrepancies. To instantiate these attacks, we introduce a lightweight extraction method that trains an attacker encoder using supervised regression of graph embeddings. Even without contrastive pretraining data, this method learns an encoder that stays aligned with the victim text encoder and preserves its zero-shot inference ability on unseen graphs. Experiments on seven datasets show that the attacker can approximate the victim model using only a tiny fraction of its original training cost, with almost no loss in accuracy. These findings reveal that GFMs greatly expand the MEA surface and highlight the need for deployment-aware security defenses in large-scale graph learning systems.
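The core idea of extraction by supervised embedding regression can be illustrated with a toy sketch. It is not the paper's method or architecture: the victim here is a hypothetical fixed linear map standing in for a black-box embedding API, the attacker is a linear encoder rather than a graph encoder, and all names, dimensions, and hyperparameters are assumptions chosen for illustration.

```python
import random


def victim_embed(x):
    """Hypothetical black-box victim: the attacker sees outputs only."""
    return [2.0 * x[0] - 1.0 * x[1], 0.5 * x[0] + 1.5 * x[1]]


def encode(weights, x):
    """Attacker encoder: a plain linear map, one weight row per output dim."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def mse(weights, queries, targets):
    """Mean squared error between attacker and victim embeddings."""
    total = 0.0
    for x, t in zip(queries, targets):
        pred = encode(weights, x)
        total += sum((p - ti) ** 2 for p, ti in zip(pred, t))
    return total / len(queries)


def train_attacker(queries, targets, lr=0.05, epochs=500):
    """Fit the attacker by per-sample gradient descent on squared error."""
    dim_out, dim_in = len(targets[0]), len(queries[0])
    weights = [[0.0] * dim_in for _ in range(dim_out)]
    for _ in range(epochs):
        for x, t in zip(queries, targets):
            pred = encode(weights, x)
            for k in range(dim_out):
                err = pred[k] - t[k]
                for j in range(dim_in):
                    weights[k][j] -= lr * 2.0 * err * x[j]
    return weights


# The attacker spends a small query budget, records the victim's
# embeddings, and regresses onto them.
rng = random.Random(0)
queries = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(32)]
targets = [victim_embed(x) for x in queries]

initial_loss = mse([[0.0, 0.0], [0.0, 0.0]], queries, targets)
attacker = train_attacker(queries, targets)
final_loss = mse(attacker, queries, targets)

# Held-out input never queried during training.
unseen = [0.3, -0.7]
gap = max(abs(p - t) for p, t in zip(encode(attacker, unseen), victim_embed(unseen)))
```

Because the attacker is trained to land in the victim's embedding space, anything aligned with that space (in the paper's setting, the victim text encoder) remains usable with the stolen encoder; the toy example only shows the regression step and the agreement on an unseen input.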